Accuracy of Veterans Affairs databases for diagnoses of chronic diseases.

Introduction Epidemiologic studies usually use database diagnoses or patient self-report to identify disease cohorts, but no previous research has examined the extent to which self-report of chronic disease agrees with database diagnoses in a Veterans Affairs (VA) health care setting. Methods All veterans who had a medical care visit from October 1, 1996, through May 31, 1998, at any of the Veterans Integrated Service Network 13 facilities were surveyed about physician diagnosis of chronic obstructive pulmonary disease (COPD)/asthma, depression, diabetes, and heart disease. Four administrative case definitions (data from VA databases) consisting of combinations of International Classification of Diseases, Ninth Revision, codes and disease-specific medication data were compared with self-report of each disease to assess sensitivity, specificity, positive and negative predictive values, area under receiver operating characteristics curve, and κ statistic. Results Sensitivity for administrative definitions compared with self-report of physician diagnosis was 24% to 54% for COPD/asthma, 25% to 47% for depression, 27% to 59% for heart disease, and 64% to 78% for diabetes. Specificity was 88% to 100% for all diseases. The κ statistic showed fair agreement for COPD/asthma, depression, and heart disease and substantial agreement for diabetes. Conclusion Diagnoses identified from databases agree with self-report for diabetes but not COPD/asthma, depression, or heart disease in a VA health care setting.


Introduction
Large epidemiologic or health services studies usually rely on administrative or clinical databases (eg, Medicare, Medicaid, Veterans Affairs, health maintenance organizations) or patient self-report (eg, data from the Behavioral Risk Factor Surveillance System) for case detection. Few studies, however, have examined the extent to which self-reported diagnoses agree with those obtained from databases. Self-reported diagnoses have good agreement with those obtained from databases for hypertension (1) and diabetes (2), but self-reported (3-8) and database-derived (9,10) data have variable rates of agreement with diagnoses obtained from medical records review and physical examination. Two studies that combined diagnoses from databases with prescription information found that the combination was more accurate than either method alone for hypertension (1) and diabetes (11).
The Veterans Health Administration is the largest integrated health care system in the United States. One Veterans Affairs (VA) study compared diagnostic accuracy in veterans with serious mental illness and found that they are less aware of comorbidities (12), but to my knowledge, no previous study has examined the extent to which self-report of chronic diseases agrees with diagnoses from databases in a VA health care setting. Veterans are sicker and have more comorbidities than do age-matched Americans in general (13), and since comorbidity is associated with less accuracy of self-report (4), rates of agreement may be lower and predictors different in a veteran cohort than in the general population. In the general population, younger age, better cognition, more education, less comorbidity, female sex, being married, and more frequent use of medical services are associated with more accurate self-report (4,5,7,14-16). Alternatively, a small social network, major depression, recent alcohol abuse, and serious mental illness are associated with less accurate self-report (6,12).
In a study of quality of life of veterans receiving health care in Veterans Integrated Service Network 13 (VISN-13), data were collected regarding patient self-report of chronic diseases, administrative diagnoses, and use of medications (17). These data were used to examine the extent to which patient self-report of physician diagnosis agrees with data obtained from administrative databases and whether patient demographic, clinical, or functional parameters affect the agreement.

Veterans' Quality of Life Study
The Veterans' Quality of Life Study was a cohort study of all veterans who received inpatient or outpatient health care at any of the VISN-13 facilities (covering all of Minnesota, North Dakota, and South Dakota and selected counties in Iowa, Nebraska, Wisconsin, and Wyoming) from October 1, 1996, through May 31, 1998, and had a valid mailing address (17). This cohort of veterans was mailed a survey, and a repeat mailing was sent to nonresponders. The survey response rate was 58% (40,508 of 70,334 eligible veterans).
The survey included questions regarding 1) self-report of physician diagnosis of chronic conditions, including chronic obstructive pulmonary disease (COPD) or asthma, depression, diabetes, heart disease, hypertension, and arthritis; 2) demographic information, including sex, education level, and race/ethnicity; 3) smoking status; and 4) functional limitation as assessed by limitation of activities of daily living, such as bathing, dressing, eating, getting in and out of a chair, walking, and using the bathroom (18). The survey also included the SF-36V (Short Form Health Survey for Veterans) (19), which consists of 8 subscales: physical functioning, bodily pain, general health, vitality, mental health, social functioning, role emotional, and role physical (role limitations due to emotional or physical problems, respectively). Physical and mental component summary scores of SF-36V were generated from the 8 subscales, standardized to the US population, and norm-based; possible values ranged from 0 to 100, and higher scores corresponded to better health.
Prospective and retrospective cohort data were obtained for the year before and the year after the survey from the Patient Treatment File and the Outpatient Clinic data sets in the VA administrative databases at the Austin Automation Center in Austin, Texas. These data are reliable for demographic characteristics and most common diagnoses (20) and valid for specific diagnoses (9,10). Data extracted included demographics (age, marital status, employment status, and percentage service connection) and health care use. A veteran is considered "service connected" if he or she has disabilities resulting from or beginning during active military duty, and veterans with a service connection of 50% or higher get priority access to VA care. Data regarding number of inpatient hospitalizations and outpatient visits in primary care, specialty medical care, surgical care, and mental health were extracted and categorized according to stop codes.

Validation study
In addition to the above data, International Classification of Diseases, Ninth Revision (ICD-9), codes and prescription data for 4 self-reported comorbidities (COPD/asthma, depression, diabetes, and heart disease) were extracted from each facility for this study. For heart disease only, Current Procedural Terminology codes for percutaneous transluminal coronary angioplasty and coronary artery bypass grafting were also extracted. The extraction covered a 2-year period comprising the year before and the year after the survey (Table 1). The pharmacy database (VISN-13 Veterans Health Information Systems and Technology Architecture) was searched for 2 or more refills of the prescriptions specific to each condition that were available in the VA pharmacy during the 2-year period of this search. Only disease-specific medications were searched rather than all medications, since this strategy was intended to be specific (and not sensitive) for case detection. From the pharmacy and ICD-9 code information, the following 4 database case definitions were formulated for each disease and compared with self-report of physician diagnosis: ICD-9 code, medication use, ICD-9 code or medication use, and ICD-9 code and medication use.
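The 4 case definitions can be illustrated with a minimal sketch. The record layout, the field names, and the `MIN_REFILLS` constant are hypothetical and are not taken from the VA databases; the threshold of 2 simply mirrors the "2 or more refills" criterion described above.

```python
from dataclasses import dataclass

# Hypothetical threshold mirroring the "2 or more refills" criterion.
MIN_REFILLS = 2


@dataclass
class PatientRecord:
    """Illustrative per-patient, per-disease summary (field names are assumptions)."""
    has_icd9_code: bool   # any qualifying ICD-9 code in the search window
    refill_count: int     # refills of disease-specific medications in the window


def case_definitions(rec: PatientRecord) -> dict:
    """Return the 4 administrative case definitions as booleans."""
    on_medication = rec.refill_count >= MIN_REFILLS
    return {
        "icd9": rec.has_icd9_code,
        "medication": on_medication,
        "icd9_or_medication": rec.has_icd9_code or on_medication,    # most inclusive
        "icd9_and_medication": rec.has_icd9_code and on_medication,  # strictest
    }


# A patient with a code but only 1 refill meets the code-based and the
# inclusive definitions, but not the medication-based or strictest ones.
flags = case_definitions(PatientRecord(has_icd9_code=True, refill_count=1))
```

Because the "or" definition is a superset of each single-source definition and the "and" definition is a subset, the former can only gain sensitivity and the latter can only gain specificity relative to either source alone.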

Statistical analyses
The accuracy of various administrative database case definitions was calculated for each disease by comparing them with patient self-report of physician diagnosis for each condition. The agreement between database case definitions and self-report was assessed by calculating the κ statistic (21). The measures of accuracy included sensitivity, specificity, positive and negative predictive values, and area under the receiver operating characteristics curve. Sensitivity was defined as the fraction of patients who reported physician diagnosis of a condition that was correctly identified as positive for that condition by each administrative database case definition (ICD-9 code only, medication only, ICD-9 code or medication, ICD-9 code and medication). Specificity was defined as the fraction of those who did not report diagnosis of a condition that was correctly identified as negative for the condition by each database case definition. Positive predictive value was the fraction of patients who met a database case definition and also reported a physician diagnosis; negative predictive value was the fraction of patients who did not meet the database case definition and also did not report a physician diagnosis. Sensitivity analyses were performed by considering the 4 administrative case definitions for the year before the survey or the 2-year period including 1 year before and 1 year after the survey. Results are presented for the definitions that included the 1-year period (before the survey), since they did not differ substantially from those that included the 2-year period.
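As an illustration of these measures, the following sketch computes them from a 2×2 table in which self-report of physician diagnosis is the reference and a database case definition is the test. The function name and the counts are hypothetical and are not study data. For a single binary test, the area under the receiver operating characteristics curve reduces to the mean of sensitivity and specificity, and κ is Cohen's κ, (observed agreement − chance agreement) / (1 − chance agreement).

```python
def accuracy_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy and agreement measures for a 2x2 table.

    tp: self-report yes, database yes    fp: self-report no, database yes
    fn: self-report yes, database no     tn: self-report no, database no
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    # With one binary test, the ROC curve has a single operating point,
    # so its area is the mean of sensitivity and specificity.
    auc = (sensitivity + specificity) / 2
    # Cohen's kappa: agreement beyond what the marginal totals predict.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "auc": auc,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only.
measures = accuracy_measures(tp=40, fp=10, fn=20, tn=130)
```

With these invented counts, sensitivity is 40/60, specificity is 130/140, positive predictive value is 40/50, and κ works out to 0.625, which shows how a definition can be highly specific yet only moderately sensitive.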
Multivariable logistic regression analyses were performed to determine the factors significantly associated with disagreement between self-report and administrative database definition for various chronic diseases. To avoid multiple analyses, the database case definition of ICD-9 code in the year before the survey was used. This definition was chosen for multiple reasons: 1) ICD-9 code is frequently used for case detection in large epidemiologic studies, 2) ICD-9 code is easy to extract from most large databases, and 3) this administrative case definition was associated with the most agreement (highest κ statistic) with the self-report case definition in most instances. The year before the survey was chosen for the definition because patients can report a disease only if they were told of the diagnosis by their physician before the survey (and not after the survey). Various predictor factors that were modeled in these regression analyses included demographics; clinical measures; health care use, access, and eligibility measures; and health and functional status. For the purpose of analysis, outpatient visits, physical component summary, and mental component summary scores were divided into tertiles. All participants were included in the main logistic regression analysis, and differences were considered significant at P < .05.
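The tertile categorization of predictors can be sketched as follows. The study does not state how cut points were computed, so the use of Python's `statistics.quantiles` (with its default interpolation) and the sample scores are assumptions made purely for illustration.

```python
import statistics


def assign_tertiles(values: list) -> list:
    """Map each value to tertile 0 (lowest), 1 (middle), or 2 (highest)."""
    # n=3 returns the 2 cut points that split the data into thirds.
    lower_cut, upper_cut = statistics.quantiles(values, n=3)
    tertiles = []
    for v in values:
        if v <= lower_cut:
            tertiles.append(0)
        elif v <= upper_cut:
            tertiles.append(1)
        else:
            tertiles.append(2)
    return tertiles


# Hypothetical component summary scores, not study data.
scores = [31, 45, 52, 38, 60, 27, 49, 55, 41]
groups = assign_tertiles(scores)
```

Entering tertile membership as a categorical predictor lets the regression estimate separate odds ratios for the middle and highest thirds relative to the lowest, without assuming a linear effect of the underlying score.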

Results
Participants with various conditions had similar outpatient and inpatient use (Table 2). More patients with depression reported at least 1 mental health visit, service connection, and unemployment than did patients with other conditions. Smoking was more prevalent in patients with COPD/asthma and depression than in patients with diabetes or heart disease.
Fair-to-moderate agreement for COPD/asthma, depression, and heart disease and substantial agreement for diabetes were found (Table 3). In general, κ statistics were the highest when the most inclusive administrative case definition was examined (either ICD-9 code or medication use) and lowest when the strictest definition was considered (both ICD-9 code and medication use). κ statistics were similar when administrative data definitions during a 2-year period were considered rather than the prior year (data not shown).
Sensitivity and negative predictive value were highest for the administrative case definition of ICD-9 code or medication use, and specificity and positive predictive value were highest for the administrative case definition of ICD-9 code and medication use (Table 3). For example, for diabetes, the sensitivity and positive predictive values were, respectively, 76% and 91% for the ICD-9 code definition (most sensitive definition), which implies that 76% of patients who reported physician-diagnosed diabetes had an ICD-9 code for it, and 91% of those with an ICD-9 code for diabetes could correctly identify their diagnosis. Similarly, the specificity and positive predictive values of 100% and 98% for ICD-9 code and medication use definition for diabetes (most specific definition) indicate that all patients who reported absence of physician-diagnosed diabetes lacked an ICD-9 code and diabetes medication prescription for it and that 98% of those who did not have an ICD-9 code or diabetes medication prescription could correctly identify the absence of a diabetes diagnosis. Results were similar when administrative data definitions during a 2-year period were considered instead of the 1-year period (data not shown).
Centers for Disease Control and Prevention • www.cdc.gov/pcd/issues/2009/oct/08_0263.htm
Lower number of outpatient visits, higher number of comorbidities, and lower physical component summary score were associated with higher odds of disagreement for most chronic diseases (Table 4). Some factors had opposite effects on disagreement in different diseases; older age, for example, was associated with 10% to 20% lower odds of disagreement for diagnosis of depression but 90% to 190% higher odds of disagreement for diagnosis of heart disease. Sex was not associated with disagreement between self-report and administrative database definitions for any disease.
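The percentage changes in odds quoted above follow directly from odds ratios: an odds ratio OR corresponds to a change of (OR − 1) × 100%. The sketch below shows the conversion; the odds ratios used are those implied by the percentages in the text, not values copied from Table 4, and the function name is hypothetical.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Convert an odds ratio to a percent change in odds (negative = lower odds)."""
    return round((odds_ratio - 1) * 100, 1)


# Odds ratios implied by "10% to 20% lower odds" (depression, older age).
depression_changes = [percent_change_in_odds(o) for o in (0.9, 0.8)]

# Odds ratios implied by "90% to 190% higher odds" (heart disease, older age).
heart_disease_changes = [percent_change_in_odds(o) for o in (1.9, 2.9)]
```

An odds ratio below 1 therefore reads as lower odds of disagreement and an odds ratio above 1 as higher odds, which is why the same predictor can point in opposite directions for different diseases.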

Discussion
This study of elderly veterans in VISN-13 found fair-to-moderate agreement between administrative definitions and self-report of COPD/asthma, heart disease, and depression and substantial agreement for diabetes. High κ and positive predictive values for administrative database definitions versus self-report for diabetes confirm similar earlier findings of κ 0.70 to 0.93 (3,22-28) and positive predictive value of 77% to 94% (3,23,29). The present study extends these findings to VA databases. This study differs slightly from previous studies in that self-report was compared with databases rather than with medical chart documentation (3,22,24) or physical examination findings (23). The study most similar in design to this one found a κ of 0.81 between Medicaid claims data and self-reported diabetes in a sample of 2,154 adult Medicaid recipients in Oregon (2).
The finding of a much higher level of agreement between self-report and administrative database diagnosis of diabetes as compared with COPD/asthma, depression, and heart disease confirms similar previous findings in heart disease (3,23,28), COPD (28), and depression (23). It also supports the assertion that if a disease is conceptually clear (for example, diabetes), severe, or persistent, it is easily communicated by the doctor to the patient (23). In addition, ambiguity of some survey questions, differences in patient knowledge and perception by disease, and specificity of medications for particular diseases may have contributed to differences in level of agreement. For example, the question regarding heart disease asked patients about "myocardial infarction, heart attack, or heart problems including angina," which may not be as easily understood by patients as the question about diabetes.
That higher number of comorbidities and older age increased discrepancy (decreased κ) between self-report and database diagnoses confirmed an earlier finding of lower κ between self-report and medical records-based algorithms in women aged 65 years or older (4), in a random sample of Olmsted County residents (30), and in a representative sample of Finnish residents aged 45 to 73 years (3) and is in contrast to findings in a study of patients with end-stage renal disease (28). The studies differ in that self-report was compared with database diagnoses in this study and with medical records-based algorithms (4) or physician diagnosis and medical record (3,28,30) in the other studies.
Findings of more disagreement in nonwhite and less educated patients confirm similar findings (8). Lower physical or mental health status and being nonwhite were associated with higher odds of disagreement for the chronic conditions. This finding may be secondary to increased recall bias in these groups. For COPD/asthma, being a smoker was associated with 60% higher odds of disagreement, which may be secondary to underreporting of COPD/asthma by smokers because of denial or overdocumentation of COPD/asthma diagnosis by physicians.
More physician visits are associated with more disagreement between self-report and medical record evidence for cardiovascular diseases (3). In the present study, increased outpatient use was associated with lower discordance for heart disease, which may be secondary to more effective patient-physician communication.
This study has several limitations. The nonresponse bias and cohort characteristics (elderly veterans, predominantly male and white) may limit the generalizability. However, these data should be useful to VA epidemiologists who use computerized databases. Shortcomings in the questionnaire design may also have influenced the level of agreement, as previously described (24). On the other hand, use of more specific questions (such as asking about coronary artery disease) may lead to even more disagreement because lay people may not be familiar with the vocabulary. For epidemiologic studies, neither self-report nor diagnosis from databases is a standard, but they are the most common methods for identifying cohorts. The validity was examined in only 1 VISN of the VA system and may not reflect coding practices for the entire VA.
Since the data are more than 10 years old, some codes or coding practices may have changed or VA data sets may be more complete or accurate now. Finally, patients who participated in this study may have been different from the general VA population, which would introduce selection bias.
This study also has several strengths. The sample was large, and results were robust across database definitions, including various combinations of ICD-9 codes and prescription of disease-specific medication. The self-report definition in this study is, in fact, self-report of physician diagnosis, which is more accurate than self-report alone.
These findings also have clinical implications. The finding that 89% to 91% of elderly veterans with COPD/asthma, depression, or heart disease who are being treated for the condition (ICD-9 code plus medication) could identify their diagnosis implies that these veterans can be identified without accessing medical records and could be targeted for interventions at a community level, such as education on self-management, healthy lifestyles, and exercise and other nonpharmacologic interventions. These interventions may be even more relevant for patients with diabetes (exercise, weight reduction, foot care, and self-monitoring), since 98% can identify their disease.
In summary, agreement between self-report of physician diagnosis and database diagnoses differs by the diagnosis. Agreement is fair to moderate for COPD/asthma, heart disease, and depression and substantial for diabetes. The effect of patient demographic, clinical, health care use, and access measures underscores the limitation of common approaches that use patient self-report or administrative databases to identify disease cohorts. Further studies should develop algorithms to improve the methods of patient cohort selection.
The opinions expressed by authors contributing to this journal do not necessarily reflect the opinions of the US Department of Health and Human Services, the Public Health Service, the Centers for Disease Control and Prevention, or the authors' affiliated institutions. Use of trade names is for identification only and does not imply endorsement by any of the groups named above.